Abstract

In this paper, we study a general online linear programming problem whose formulation
encompasses many practical dynamic resource allocation problems, including internet advertising
display applications, revenue management, various routing, packing, and auction problems. We
propose a model, which under mild assumptions, allows us to design near-optimal learning-based
online algorithms that do not require the a priori knowledge about the total number of online
requests to come, a first of its kind. We then consider two variants of the problem that relax
the initial assumptions imposed on the proposed model.

1 Introduction 



Online optimization is attracting wide attention from computer science and operations research
communities. It has many applications, including those dealing with dynamic resource allocation
problems. In many real-world problems, information about the instance to optimize is not
completely known ahead of time, but revealed in an online fashion. For example, in typical revenue
management problems, customers arrive sequentially offering a price for a subset of commodities,
e.g. multi-leg flights. The seller must make irrevocable decisions to accept or reject customers
at their arrivals, and try to maximize long-term overall revenue while respecting various resource
constraints. Another example is the so-called AdWords problem, also known as the display ads
problem. From keyword search queries arriving online, the problem is to sequentially allocate ad
slots to budget-constrained bidders/advertisers. Similar problems appear in online routing problems,
online packing problems, online auctions, and various internet advertising display applications.

In this paper, we consider a general online linear programming problem that covers many of the
examples mentioned above. To be precise about the problem, we need to introduce some notation.
Let $I$ be a set of $m$ resources; associated with each resource $i \in I$ is a capacity $b_i$. The
set of resources and their capacities are known ahead of time. Let $J$ be a set of $n$ customers;
each customer $j$ has a set of options $O_j$ and an arrival time $t_j$. We assume that every customer
has a bounded number of options, i.e. there exists a constant $q$ such that $|O_j| \le q$ for all $j$.
Each option $o \in O_j$ has a value $\pi_{jo}$ and requires $a_{ijo}$ units of resource $i$ for each
$i \in I$, also written $a_{jo}$ as a vector of dimension $m$. The set of options $O_j$ and the
associated $(\pi_{jo}, a_{jo})$ are revealed at time $t_j$ when customer $j$ arrives. Upon
arrival, the online algorithm must decide immediately and irrevocably whether or not to satisfy the
customer, and, if yes, which option to choose. The goal is to find a solution that maximizes the
overall revenue from customers while respecting resource constraints. More precisely, we consider
the following linear program:

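$$
\begin{array}{lll}
\max & \sum_j \sum_{o \in O_j} \pi_{jo} x_{jo} & \\
\text{s.t.} & \sum_{j,o} a_{ijo} x_{jo} \le b_i, & \forall i \\
 & \sum_{o \in O_j} x_{jo} \le 1, & \forall j \\
 & x_{jo} \ge 0, & \forall j, o \in O_j
\end{array}
\qquad (1)
$$

where $\forall j$, $\pi_j \in (0,1]^{|O_j|}$, $a_j \in [0,1]^{m \times |O_j|}$, and $b \in \mathbb{R}^m_+$.
In the online version of this problem, $(\pi_j, a_j)$ is revealed only when customer $j$ arrives at
time $t_j$. Upon that arrival, and constrained by irrevocable decisions $x_{j'}$ made for customers
arriving earlier, the online algorithm must then make decisions $x_{jo}$, such that

$$
\begin{array}{ll}
\sum_{j' : t_{j'} \le t_j} \sum_{o \in O_{j'}} a_{ij'o} x_{j'o} \le b_i, & \forall i \\
\sum_{o \in O_j} x_{jo} \le 1 & \\
x_{jo} \ge 0, & \forall o \in O_j
\end{array}
\qquad (2)
$$

The goal is to choose the variables $x$ such that the objective function
$\sum_j \sum_{o \in O_j} \pi_{jo} x_{jo}$ is maximized.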
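As a concrete reading of constraints (2), the following minimal Python sketch tracks remaining capacities and accepts an option only if every resource can still cover it. The names `Option` and `try_accept` are hypothetical, and the sketch assumes integral decisions (an option is either fully taken or not), which is a simplification of the LP above.

```python
from dataclasses import dataclass

@dataclass
class Option:
    value: float   # pi_jo, assumed in (0, 1]
    usage: list    # usage[i] = a_ijo, units of resource i, assumed in [0, 1]

def try_accept(option, remaining):
    """Accept `option` if it fits in the remaining capacities; update them in place."""
    if all(u <= r + 1e-12 for u, r in zip(option.usage, remaining)):
        for i, u in enumerate(option.usage):
            remaining[i] -= u
        return True
    return False

# Two resources with capacities b = (1.0, 0.5); two customers with one option each.
remaining = [1.0, 0.5]
accepted = [
    try_accept(Option(0.9, [0.6, 0.2]), remaining),  # fits
    try_accept(Option(0.8, [0.6, 0.1]), remaining),  # resource 0 has only 0.4 left
]
```

The second customer is rejected even though its value is high, because the earlier irrevocable decision already consumed most of resource 0.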

Several models (on how online instances are chosen) can be used to evaluate online algorithms, 
including the adversarial model, the i.i.d. model with or without knowledge of distributions, and 
the random permutation model. In the adversarial setting, no further assumption is made on 
the model. In that case, no online algorithm can achieve better than an $O(1/n)$ fraction of the
optimal offline solution [3]. However, as the adversarial setting is too conservative, it is natural to 
consider stochastic models. In the i.i.d. model with known distribution about future customers, 
positive results have been obtained for various problems. For many practical problems, such an 
assumption may be too strong, and the i.i.d. model without knowledge of the distribution would 
be more suitable. A weaker model, but easier to analyze, the random permutation model, has been 
considered more frequently. In that model, the order of customers is a uniform random permutation, 
and many near-optimal results have been obtained for it. 

In this paper, we propose a new model, closer to the random permutation model, but removing 
a fundamental, yet practically questionable, assumption behind it. In all results using the random 
permutation model, the exact knowledge about the total number of customers to come is a key 
assumption, essential for ensuring near-optimality results. Without such information, no non-trivial 
result can be achieved. In many practical settings, including all the applications discussed above, 
this assumption is however far from being realistic. We consider instead a more realistic and natural 
setting, initially using the following two assumptions (the consequences of the relaxations of these 
two initial assumptions will also be considered in our paper): 

Assumption 1. Customers have i.i.d. random arrival times. 

The assumption is reasonable in many practical problems where customers' arrival rates are
homogeneous throughout time. Ignoring the specific arrival times, the order of customers is
essentially equivalent to the random permutation model. Later in the paper, we will relax the
assumption and take heterogeneity of arrival rates into account.

Assumption 2. The distribution governing random arrival times is known to the online algorithm. 

The assumption is necessary to estimate the total number of customers in case no past data is
available. However, as discussed in Section 4, if a limited amount of past data is available, this
assumption is not needed anymore.

For simplicity in the presentation of this paper, we make two additional technical assumptions,
which can be removed without compromising the validity of our results, as we explain below.

Assumption 3. The arrival time is modeled as a continuous random variable. 

No matter what the nature of the original random variable is, we can add an auxiliary random
variable $\tau_j$, uniformly distributed on $[0, 1]$, for every customer $j$ upon his arrival. We
define a total ordering on pairs $(t_j, \tau_j)$ based on lexicographic order. Note that the order
of customers is preserved except for those who arrive exactly at the same time. The artificial
ordering imposed on these customers does not help an online algorithm.
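The tie-breaking device can be sketched in a few lines of Python. The function name `arrival_order` is hypothetical; the auxiliary key is drawn from a seeded generator only so that the example is reproducible.

```python
import random

def arrival_order(arrival_times, rng):
    """Order customer indices lexicographically by (t_j, tau_j), with tau_j ~ U[0, 1]."""
    keyed = [(t, rng.random(), j) for j, t in enumerate(arrival_times)]
    return [j for _, _, j in sorted(keyed)]

# Customers 0 and 2 arrive at exactly the same time; the auxiliary key separates them.
order = arrival_order([3.0, 1.0, 3.0, 2.0], random.Random(0))
```

Customers with distinct arrival times keep their relative order; only the tied pair is ordered by the auxiliary draw.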

Assumption 4. There are no degeneracies among all points $\{(\pi_{jo}, a_{jo})\}_{j,o}$ and $(0,0)$,
i.e. no $m+2$ points share the same $m$-dimensional hyperplane.

If this is not the case, we can introduce a random perturbation on $\pi_{jo}$: every $\pi_{jo}$ is
multiplied by an i.i.d. random variable uniformly distributed on $[1, 1+\epsilon]$. After the
perturbation, there are no degeneracies almost surely. On the other hand, because the perturbation
is small enough, the optimal value of the solution is affected by no more than a multiplicative
factor of $1+\epsilon$.
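A minimal Python sketch of this perturbation follows; `perturb_values` is a hypothetical helper name. Since each perturbed value lies between $\pi_{jo}$ and $(1+\epsilon)\pi_{jo}$, the objective of any fixed feasible solution moves by at most a factor $1+\epsilon$.

```python
import random

def perturb_values(values, eps, rng):
    """Multiply each value by an independent draw from U[1, 1 + eps]."""
    return [v * rng.uniform(1.0, 1.0 + eps) for v in values]

values = [0.5, 0.5, 0.25]          # two identical values: a possible degeneracy
perturbed = perturb_values(values, 0.01, random.Random(1))
```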

1.1 Our Techniques and Contributions 

The online algorithms proposed in our paper share similar ideas with some other papers [6, 2, 7]:
the algorithms first observe (without making any allocation) customers arriving early over a given
period of time, and solve an offline LP problem over those customers. The corresponding optimal
dual solution then works as a pricing mechanism for making online allocations on the following set
of customers. The dual prices are updated from time to time to depict customers' preference more
accurately as time moves along. We prove that such algorithms are $1-\epsilon$ competitive under
several different scenarios if resource capacities are large enough.

Our paper significantly improves previous results by removing the need to know a priori the
number of customers $n$, a critical assumption in [6, 2, 7]. To the best of our knowledge, this is
the first attempt to do so. As pointed out in several papers, knowing $n$ is so essential that no
near-optimal online algorithms can be obtained even under a probabilistic version of that
assumption. So a new model, with near-optimal online algorithm aspiration, would need to introduce
alternative assumptions.

We believe our model with arrival time fits reality better: In practice, the setting that an 
online problem is more likely to face typically involves a known fixed period of time over which the 
customers are considered, rather than a known fixed number of customers to come. The question 
that a company usually asks is how to maximize revenue over a given period of time instead of how 
to maximize revenue over a given fixed number of future customers. The arrival time of a customer 
is also more natural and informative than his rank order. Furthermore, our model is more flexible 
as it allows, depending on specific applications, various extensions which can better fit real-world 
scenarios. For example, in airline revenue management problems, business customers and casual 
customers have different price-sensitivity and arrival time. The random permutation model cannot 
capture the heterogeneity among customers well. In contrast, our model can easily be extended to 
such scenarios, as demonstrated in Section 5.

We first consider problems where the distribution of arrival time is known in advance. Although
similar in form, the previous approaches with a fixed number of customers do not address our model
well. One could first estimate the total number of arrivals in the early stage, and then use the
estimate for the fixed-number algorithm as in [2]. However, the performance of this approach
depends on the quality of the estimate. In order to keep the loss due to the estimation below an
$\epsilon$-fraction, the estimation error must be within an $\epsilon$-fraction. According to
concentration laws, this requires the total number of customers to be at least $O(1/\epsilon^3)$.
Noting that the $b_i$'s are only required to be $O(1/\epsilon^2)$, this new requirement on the total
number of customers is quite restrictive. On the other hand, our approach works for any number of
customers, even if the number is smaller than $O(1/\epsilon^2)$.

We then consider two scenarios for which our initial assumptions are relaxed. In the first
scenario, we do not assume knowledge of the exact distribution, but of some past observations
instead. Instead of estimating the cumulative distribution function (CDF) at every point, we
estimate it at only a few critical points. This approach requires far fewer data points than
the naive one. In the second scenario, we consider heterogeneous customers.

1.2 Literature Review 

Inspired by applications such as advertisement display, online matching and allocation problems
have recently been studied extensively in the operations research and computer science
communities. Three different models have been considered: the adversarial model, the i.i.d. model,
and the random permutation model.

In an adversarial model, no information is known to online algorithms about customers or
requests. Karp et al. [12] consider a bipartite matching problem and present a best possible
algorithm, RANKING, with a competitive ratio of $1 - 1/e$. Aggarwal et al. [1] propose a
$(1 - 1/e)$-competitive algorithm for a vertex-weighted version of the same problem. Mehta et al.
[15] and Buchbinder et al. [5] propose two different best possible algorithms for the AdWords
problem.

In the random permutation model, the set of customers is still unknown to the online algorithm,
but the order in which customers arrive is a uniformly random permutation. Goel and Mehta [9]
prove that a greedy algorithm is $1 - 1/e$ competitive for the AdWords problem. Devanur and
Hayes [6] present a near optimal online algorithm for the same problem under mild assumptions.
More recently, Agrawal et al. [2] and Feldman et al. [7] propose near optimal algorithms, based
on similar ideas, for general linear programming problems and packing problems. Mahdian and
Yan [13] and Karande et al. [11] simultaneously prove that the RANKING algorithm is
0.696-competitive for the bipartite matching problem. Mirrokni et al. [16] propose an algorithm
that works well for the AdWords problem in both the adversarial and the random permutation model.

In the i.i.d. model, customers or requests are drawn repeatedly and independently from a known 
probability distribution. Feldman et al. [8] present a 0.670-competitive algorithm for a bipartite 
matching problem. Manshadi et al. [14] give a 0.702-competitive algorithm for a slight variation of 
the same problem. Jaillet and Lu [10] improve both these bounds to 0.729 and 0.706, respectively. 

Organization: The rest of the paper is organized as follows. In Sections 2 and 3, we present online
algorithms for our basic model and prove that they are near optimal under mild conditions. In the
following two sections, we consider situations where we can remove assumptions imposed on the
model: the assumption on the knowledge of arrival distributions in Section 4, and the assumption
on the homogeneity of customers in Section 5. In both sections, we propose online algorithms and
prove that they are near optimal.

2 One-Time Learning 

Let $F(\cdot)$ be the cumulative distribution function of the random arrival time of customers.
Assumption 3 ensures that its inverse is well defined. Consider
$S_\epsilon = \{j : t_j \le F^{-1}(\epsilon)\}$, the set of customers arriving earlier than
$F^{-1}(\epsilon)$. From Assumption 1, every customer belongs to $S_\epsilon$ with probability
$\epsilon$.

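The online algorithm observes customers in $S_\epsilon$, rejects them all, and then computes dual
prices by solving the following primal and dual LPs:

$$
\begin{array}{lll}
\max & \sum_{j \in S_\epsilon, o \in O_j} \pi_{jo} x_{jo} & \\
\text{s.t.} & \sum_{j \in S_\epsilon, o \in O_j} a_{ijo} x_{jo} \le (1-\epsilon)\epsilon b_i, & \forall i \\
 & \sum_{o \in O_j} x_{jo} \le 1, & \forall j \in S_\epsilon \\
 & x_{jo} \ge 0, & \forall j \in S_\epsilon, o \in O_j
\end{array}
\qquad
\begin{array}{lll}
\min & \sum_i (1-\epsilon)\epsilon b_i p_i + \sum_{j \in S_\epsilon} q_j & \\
\text{s.t.} & \sum_i a_{ijo} p_i + q_j \ge \pi_{jo}, & \forall j \in S_\epsilon, o \in O_j \\
 & p_i \ge 0, & \forall i \\
 & q_j \ge 0, & \forall j \in S_\epsilon
\end{array}
\qquad (3)
$$

Let $\hat{x}$ and $\hat{p}$ be the optimal primal and dual solutions for these problems.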

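For customers arriving later, they are accepted if payments exceed the threshold set by $\hat{p}$:

$$
x_{jo}(p) =
\begin{cases}
1, & \text{if } \pi_{jo} - \sum_i a_{ijo} p_i > \max_{o' \ne o} \{\pi_{jo'} - \sum_i a_{ijo'} p_i, 0\} \\
0, & \text{otherwise}
\end{cases}
\qquad (4)
$$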
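Decision rule (4) can be sketched as a small Python function. The name `decide` and the list-based data layout are assumptions made for illustration; the rule itself is the strict comparison in (4), so ties and zero surplus lead to rejection.

```python
def decide(values, usages, prices):
    """Return the index of the option chosen by rule (4), or None to reject.

    values[o] = pi_jo, usages[o][i] = a_ijo, prices[i] = p_i.
    """
    surplus = [v - sum(a * p for a, p in zip(u, prices))
               for v, u in zip(values, usages)]
    best = max(range(len(surplus)), key=surplus.__getitem__)
    if surplus[best] <= 0:
        return None  # no option beats the zero threshold
    if any(o != best and surplus[o] >= surplus[best] for o in range(len(surplus))):
        return None  # maximiser not unique: strict inequality in (4) fails
    return best

values = [0.9, 0.5]
usages = [[1.0, 0.0], [0.0, 0.5]]
choice_cheap = decide(values, usages, [0.5, 0.4])   # surpluses 0.4 and 0.3
choice_costly = decide(values, usages, [1.0, 1.0])  # surpluses -0.1 and 0.0
```

With low prices the first option is taken; with high prices no option has strictly positive surplus and the customer is rejected.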

The proposed online algorithm is then as follows: 



Algorithm 1 Online Learning Algorithm (OLA)

1: Reject all customers arriving earlier than $F^{-1}(\epsilon)$.
2: Let $x_j = x_j(\hat{p})$ for customers arriving after $F^{-1}(\epsilon)$.
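To make the two steps concrete, here is a minimal Python sketch of OLA in a deliberately restricted setting: one resource ($m = 1$) and one option per customer with strictly positive usage. Under these assumptions the sample LP (3) is a fractional knapsack, so its dual price is the value-to-usage ratio of the first sampled customer that no longer fits. The function names, the tuple layout, and the explicit remaining-capacity check are illustrative additions and are not part of the paper's algorithm.

```python
def sample_dual_price(sample, capacity):
    """Dual price of the capacity constraint of LP (3), single-resource special case.

    sample: list of (value, usage) pairs with usage > 0.
    """
    used = 0.0
    for value, usage in sorted(sample, key=lambda c: c[0] / c[1], reverse=True):
        if used + usage > capacity:
            return value / usage   # critical customer sets the price
        used += usage
    return 0.0                     # capacity not binding on the sample

def ola(customers, b, eps, quantile_time):
    """customers: list of (arrival_time, value, usage); quantile_time plays F^{-1}(eps)."""
    sample = [(v, a) for t, v, a in customers if t <= quantile_time]
    price = sample_dual_price(sample, (1 - eps) * eps * b)
    revenue, remaining = 0.0, b
    for t, v, a in customers:
        # Step 1 rejects the sample; step 2 applies the threshold rule.
        if t > quantile_time and v - a * price > 0 and a <= remaining:
            revenue += v
            remaining -= a
    return price, revenue

customers = [(0.05, 0.9, 0.5), (0.08, 0.2, 0.5),
             (0.30, 0.8, 0.5), (0.60, 0.1, 0.5), (0.90, 0.7, 0.5)]
price, revenue = ola(customers, b=10.0, eps=0.1, quantile_time=0.1)
```

In this toy instance the two sampled customers yield a price of 0.4 per unit, so the later customers with values 0.8 and 0.7 are accepted and the one with value 0.1 is rejected.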



We call a customer $j$ degenerate if the maximizer for
$\max_{o} \{\pi_{jo} - \sum_i a_{ijo} \hat{p}_i\}$ is not unique or
$\pi_{jo} - \sum_i a_{ijo} \hat{p}_i = 0$. Degeneracies may lead to undesired results. Fortunately,
due to Assumption 4, there are at most $m + 1$ degenerate customers, and all of them, if any, are in
$S_\epsilon$. For degenerate $j$, the decision rule gives $x_j(\hat{p}) = 0$. For non-degenerate
$j$, using complementary slackness, $x_j(\hat{p})$ equals the optimal solution $\hat{x}_j$ to
LP (3), as stated in the following lemma (see proof in appendix):

Lemma 1. For non-degenerate $j \in S_\epsilon$, $\hat{x}_{jo} = x_{jo}(\hat{p})$ for all $o \in O_j$.

In this paper, we repeatedly use concentration laws to show that some undesired events rarely
happen. In particular, we use Bernstein inequalities:

Bernstein Inequalities [4]: Let $X_1, \ldots, X_n$ be independent zero-mean random variables.
Suppose there exists $M > 0$ such that $|X_i| \le M$ almost surely for all $i$. Then, $\forall t > 0$,

$$
\Pr\Big(\sum_{i=1}^{n} X_i > t\Big) \le \exp\Big(-\frac{t^2/2}{\sum_{i=1}^{n} E[X_i^2] + Mt/3}\Big).
$$
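For a feel of the magnitudes involved, the right-hand side of the Bernstein inequality can be evaluated numerically. The helper name `bernstein_bound` and the numbers below are purely illustrative.

```python
import math

def bernstein_bound(t, second_moment_sum, M):
    """Bernstein upper bound on Pr(sum X_i > t) for independent zero-mean X_i, |X_i| <= M."""
    return math.exp(-(t * t / 2.0) / (second_moment_sum + M * t / 3.0))

# Example: sum of second moments 100, |X_i| <= 1, deviation t = 30.
bound = bernstein_bound(t=30.0, second_moment_sum=100.0, M=1.0)
```

Here the exponent is $-450/110$, so a deviation of three "standard deviations" already has probability below two percent.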

Toward the analysis of our online algorithms, we first show that the resulting solution is feasible 
with high probability: 

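Lemma 2. If $\min_i b_i \ge 5m\ln(nq/\epsilon)/\epsilon^3$, then w.p. $1-\epsilon$,
$\sum_{j,o} a_{ijo} x_{jo}(\hat{p}) \le b_i$ for all $i$.

Proof. From Lemma 1, we have

$$
\sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(\hat{p}) \le \sum_{j \in S_\epsilon, o} a_{ijo} \hat{x}_{jo} \le \epsilon(1-\epsilon) b_i.
$$

We would like to apply Bernstein inequalities to show that $\sum_{j,o} a_{ijo} x_{jo}(\hat{p}) > b_i$
rarely happens. The difficulty is that the random variables $\{a_{ijo} x_{jo}(\hat{p})\}_j$ depend on
the realization of $S_\epsilon$ via $\hat{p}$. To get around the issue, let us first fix $p$ and $i$,
and consider the event

$$
\Big\{ \sum_{j,o} a_{ijo} x_{jo}(p) > b_i, \ \sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(p) \le (1-\epsilon)\epsilon b_i \Big\}
\qquad (5)
$$

For every customer $j$, because the arrival time is uniformly distributed on $[0, T]$,
$j \in S_\epsilon$ with probability $\epsilon$. Hence,

$$
E\Big[\sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(p)\Big] = \epsilon \cdot \sum_{j,o} a_{ijo} x_{jo}(p).
$$

Using Bernstein's inequalities, we have

$$
\begin{aligned}
& \Pr\Big(\sum_{j,o} a_{ijo} x_{jo}(p) > b_i, \ \sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(p) \le (1-\epsilon)\epsilon b_i\Big) \\
& \le \Pr\Big(\sum_{j,o} a_{ijo} x_{jo}(p) > b_i, \ \sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(p) - E\Big[\sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(p)\Big] < -\epsilon^2 b_i\Big) \\
& \le \exp(-\epsilon^3 b_i / 4) \le \epsilon / (m n^m).
\end{aligned}
$$

According to [17], $\mathbb{R}^m$ can be divided into no more than $(nq)^m$ regions such that all
$p$ in a region lead to the same $x(p)$. By taking union bounds over all possible $p$ and $i$, we
have that with probability at most $\epsilon$, there exist $i$ and $p$ such that (5) is true. So:

$$
\begin{aligned}
& \Pr\Big(\exists i, \ \sum_{j,o} a_{ijo} x_{jo}(\hat{p}) > b_i\Big) \\
& = \Pr\Big(\exists i, \ \sum_{j,o} a_{ijo} x_{jo}(\hat{p}) > b_i, \ \sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(\hat{p}) \le (1-\epsilon)\epsilon b_i\Big) \\
& \le \Pr\Big(\exists i, p, \ \sum_{j,o} a_{ijo} x_{jo}(p) > b_i, \ \sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(p) \le (1-\epsilon)\epsilon b_i\Big) \le \epsilon.
\end{aligned}
$$

By taking the complement, we conclude the lemma. □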

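After obtaining feasibility, we now compare the online solution with the offline optimal solution
OPT. Note that OPT is the optimal value of the LPs:

$$
\begin{array}{lll}
\max & \sum_{j,o} \pi_{jo} x_{jo} & \\
\text{s.t.} & \sum_{j,o} a_{ijo} x_{jo} \le b_i, & \forall i \\
 & \sum_{o \in O_j} x_{jo} \le 1, & \forall j \\
 & x_{jo} \ge 0, & \forall j, o
\end{array}
\qquad
\begin{array}{lll}
\min & \sum_i b_i p_i + \sum_j q_j & \\
\text{s.t.} & \sum_i a_{ijo} p_i + q_j \ge \pi_{jo}, & \forall j, o \\
 & p_i \ge 0, & \forall i \\
 & q_j \ge 0, & \forall j
\end{array}
\qquad (6)
$$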



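We now show that the objective value of the online solution $x(\hat{p})$ is close to OPT:

Lemma 3. If $\min_i b_i \ge 5m\ln(nq/\epsilon)/\epsilon^3$, then w.p. $1-\epsilon$,

$$
\sum_{j,o} \pi_{jo} x_{jo}(\hat{p}) \ge (1 - 3\epsilon) \mathrm{OPT}
\qquad (7)
$$

Proof. Let us consider the following LP:

$$
\begin{array}{lll}
\max & \sum_{j,o} \pi_{jo} x_{jo} & \\
\text{s.t.} & \sum_{j,o} a_{ijo} x_{jo} \le \hat{b}_i, & \forall i \\
 & x_{jo} \ge 0, & \forall j, o
\end{array}
\qquad (8)
$$

where $\hat{b}_i = \sum_{j,o} a_{ijo} x_{jo}(\hat{p})$ if $\hat{p}_i > 0$, and
$\hat{b}_i = \max\{\sum_{j,o} a_{ijo} x_{jo}(\hat{p}), b_i\}$ if $\hat{p}_i = 0$. By complementary
slackness, $x(\hat{p})$ is the optimal solution to this LP.

We then show that with probability $1-\epsilon$, $\hat{b}_i \ge (1-3\epsilon) b_i$, $\forall i$. For
$i$ such that $\hat{p}_i = 0$, it is trivially true from the definition of $\hat{b}_i$. For $i$ such
that $\hat{p}_i > 0$, by complementary slackness, we have
$\sum_{j \in S_\epsilon, o} a_{ijo} \hat{x}_{jo} = (1-\epsilon)\epsilon b_i$. Furthermore, according
to Assumption 4 and Lemma 1, at most $m+1$ different $j$ make $x_j(\hat{p}) \ne \hat{x}_j$. Noting
that $a_{ijo} \in [0, 1]$, we have

$$
\sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(\hat{p}) \ge \sum_{j \in S_\epsilon, o} a_{ijo} \hat{x}_{jo} - (m+1) = (1-\epsilon)\epsilon b_i - (m+1) \ge (1-2\epsilon)\epsilon b_i.
$$

Using the same technique as in the proof of Lemma 2, we can show that

$$
\Pr\Big(\sum_{j \in S_\epsilon, o} a_{ijo} x_{jo}(\hat{p}) \ge (1-2\epsilon)\epsilon b_i, \ \sum_{j,o} a_{ijo} x_{jo}(\hat{p}) < (1-3\epsilon) b_i\Big) \le \epsilon.
$$

Hence, with probability $1-\epsilon$, $\hat{b}_i \ge (1-3\epsilon) b_i$, $\forall i$.

In that case, we argue that $\sum_{j,o} \pi_{jo} x_{jo}(\hat{p}) \ge (1-3\epsilon)\mathrm{OPT}$. In
fact, assuming $x^*$ is the optimal solution to LP (6), then $(1-3\epsilon)x^*$ is feasible to
LP (8). As the optimal solution to LP (8), $x(\hat{p})$ is no worse than $(1-3\epsilon)x^*$, which
concludes the lemma. □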

Note that the left hand side of (7) includes revenue from customers in $S_\epsilon$, which should
be excluded. It is upper bounded by $\mathrm{OPT}_\epsilon$, the optimal value of LP (3).

Lemma 4. $E[\mathrm{OPT}_\epsilon] \le \epsilon \cdot \mathrm{OPT}$.

Proof. Note that the optimal dual solution $(p^*, q^*)$ to (6) is also feasible to the partial dual
problem (3). Hence, the optimal value of (3) satisfies
$\mathrm{OPT}_\epsilon \le (1-\epsilon)\epsilon \sum_i b_i p_i^* + \sum_{j \in S_\epsilon} q_j^*$. By
taking expectation on both sides, we can conclude our lemma. □

We are now ready to prove the main theorem. 

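Theorem 1. If $\min_i b_i \ge 5m\ln(nq/\epsilon)/\epsilon^3$, OLA is $1 - O(\epsilon)$ competitive.

Proof. From the lemmas above, with probability $1 - 2\epsilon$ (denote this event by
$\mathcal{E}_1$), the solution $x(\hat{p})$ is feasible and
$\sum_{j,o} \pi_{jo} x_{jo}(\hat{p}) \ge (1-3\epsilon)\mathrm{OPT}$. Then,

$$
\begin{aligned}
E\Big[\sum_{j \notin S_\epsilon, o \in O_j} \pi_{jo} x_{jo}(\hat{p})\Big]
& \ge E\Big[\sum_{j \notin S_\epsilon, o \in O_j} \pi_{jo} x_{jo}(\hat{p}) \,\Big|\, \mathcal{E}_1\Big] \cdot \Pr(\mathcal{E}_1) \\
& \ge \Big(E\Big[\sum_{j,o} \pi_{jo} x_{jo}(\hat{p}) \,\Big|\, \mathcal{E}_1\Big] - E[\mathrm{OPT}_\epsilon \mid \mathcal{E}_1]\Big) \Pr(\mathcal{E}_1) \\
& \ge (1-3\epsilon)(1-2\epsilon)\mathrm{OPT} - \epsilon \mathrm{OPT} \ge (1-6\epsilon)\mathrm{OPT}.
\end{aligned}
$$

□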
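The final inequality of the proof is elementary arithmetic; spelled out, with the nonnegative quadratic term dropped at the end:

```latex
\begin{aligned}
(1-3\epsilon)(1-2\epsilon) - \epsilon
  &= 1 - 5\epsilon + 6\epsilon^2 - \epsilon \\
  &= 1 - 6\epsilon + 6\epsilon^2 \\
  &\ge 1 - 6\epsilon .
\end{aligned}
```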

3 Dynamic Pricing Algorithm 

The basic idea of OLA is to compute dual prices for resources, based on customers who arrive early.
However, because of the limited number of customers, $\min_i b_i$ is required to be as large as
$O(1/\epsilon^3)$ to have a small error probability. A natural question is whether sampling more
customers can help. The answer is affirmative, as shown in this section.
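Let $\epsilon = 2^{-E}$, where $E \in \mathbb{N}$. Let $S_l = \{j : t_j \le F^{-1}(l)\}$ be the set
of customers arriving no later than $F^{-1}(l)$ ($l \in L = \{\epsilon, 2\epsilon, 4\epsilon, \ldots\}$).
Let $\hat{p}_l$ denote the optimal dual solution to the following partial LPs:

$$
\begin{array}{lll}
\max & \sum_{j \in S_l, o \in O_j} \pi_{jo} x_{jo} & \\
\text{s.t.} & \sum_{j \in S_l, o \in O_j} a_{ijo} x_{jo} \le (1 - h_l) l b_i, & \forall i \\
 & \sum_{o \in O_j} x_{jo} \le 1, & \forall j \in S_l \\
 & x_{jo} \ge 0, & \forall j \in S_l, o \in O_j
\end{array}
\qquad (9)
$$

where $h_l = \epsilon\sqrt{1/l}$.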

Unlike OLA, DPA updates dual prices multiple times to have better performance: 



Algorithm 2 Dynamic Pricing Algorithm (DPA)
1: Reject all customers arriving earlier than $\epsilon T$.
2: Update dual prices $\hat{p}_l$ at time $\epsilon T, 2\epsilon T, 4\epsilon T, \ldots$
3: Let $x_j = x_j(\hat{p}_l)$ for customers arriving between $lT$ and $2lT$.
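The geometric update schedule and the shrinkage factors $h_l = \epsilon\sqrt{1/l}$ can be tabulated with a short Python sketch. The function name `dpa_schedule` is hypothetical, and the sketch assumes that the last update happens at $l = 1/2$, so that the final pricing period covers the second half of the horizon.

```python
import math

def dpa_schedule(E):
    """Return [(l, h_l)] for l = eps, 2*eps, ..., 1/2 with eps = 2**-E and h_l = eps*sqrt(1/l)."""
    eps = 2.0 ** -E
    schedule = []
    l = eps
    while l < 1.0:
        schedule.append((l, eps * math.sqrt(1.0 / l)))
        l *= 2.0
    return schedule

schedule = dpa_schedule(4)                      # eps = 1/16
weighted = sum(h * l for l, h in schedule)      # sum over L of h_l * l
```

The shrinkage $h_l$ is largest right after the first sample (0.25 when $\epsilon = 1/16$) and decreases as more customers are observed. The weighted sum $\sum_{l \in L} h_l l = \epsilon \sum_{l \in L} \sqrt{l}$ is a geometric series bounded by about $2.42\epsilon$, which is why the total loss from shrinking capacities stays $O(\epsilon)$.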



The analysis of DPA is very similar to the one of OLA. We show that with high probability,
the resulting solution is feasible, the resulting solution is near optimal, and the loss caused by
the observation process is small. Because of the lack of space, proofs are omitted here and can be
found in the appendix.

Lemma 5. If $\min_i b_i \ge 10m\ln(nq/\epsilon)/\epsilon^2$, then w.p. $1-\epsilon$,
$\sum_{j \in S_{2l} \setminus S_l, o \in O_j} a_{ijo} x_{jo}(\hat{p}_l) \le l b_i$, $\forall i, l$.

Lemma 6. If $\min_i b_i \ge 10m\ln(nq/\epsilon)/\epsilon^2$, then w.p. $1-\epsilon$,
$\sum_{j \in S_{2l}, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \ge (1 - 2h_l - \epsilon)\mathrm{OPT}_{2l}$, $\forall l$.

Lemma 7. Let $\mathrm{OPT}_l$ be the optimal value to LP (9), then $E[\mathrm{OPT}_l] \le l \cdot \mathrm{OPT}$.

Combining Lemmas 5, 6, and 7, we conclude the main result:

Theorem 2. If $\min_i b_i \ge 10m\ln(nq/\epsilon)/\epsilon^2$, then DPA is $1 - O(\epsilon)$ competitive.

4 Learning From the Past 

The previous two sections discuss problems where the distribution of customers' arrival time is 
known to the online algorithm ahead of time. However, the assumption may not be true in many 
applications. Instead, the online algorithm is more likely to have access to past data rather than 
the exact distribution. For example, from observation on previous days, a retail store owner may 
expect that roughly two-thirds of the customers arrive in the afternoon. Specifically, in this
section, we assume customers have i.i.d. arrival times with unknown distribution. Furthermore,
information about the $k$ past customers $\{(t'_j, \pi'_j, a'_j)\}_{j=1}^{k}$ is given to the online
algorithm. The algorithm proposed in this section only uses arrival times of past customers.

Intuitively, by concentration laws, the distribution $F(\cdot)$ can be estimated arbitrarily well
point-wise as $k$ grows. However, point-wise accuracy is unnecessary for our algorithm, and requires
a huge amount of data. Note that DPA updates its pricing policy only at times
$F^{-1}(\epsilon), F^{-1}(2\epsilon), F^{-1}(4\epsilon), \ldots$. Thus, if we could estimate those
quantile points well, we would expect the resulting algorithm to have performance similar to DPA.

First, let us show that $t'_{lk}$ is a good estimate of $F^{-1}(l)$. To be more precise,

Lemma 8. If $k \ge 5\ln(1/\epsilon)/\epsilon^2$, w.p. $1-\epsilon$,
$F^{-1}((1-h_l)l) \le t'_{lk} \le F^{-1}((1+h_l)l)$, $\forall l \in L = \{\epsilon, 2\epsilon, 4\epsilon, \ldots\}$.
Here $h_l = \epsilon\sqrt{1/l}$.

Proof. Let $N_l^1$ be the number of customers arriving in $[0, F^{-1}((1-h_l)l)]$. Then,

$$
\Pr(N_l^1 \ge lk) = \Pr(N_l^1 - E[N_l^1] \ge h_l l k) \le \exp(-\epsilon^2 k / 2).
$$

Let $N_l^2$ be the number of customers arriving in $[0, F^{-1}((1+h_l)l)]$. Then,

$$
\Pr(N_l^2 \le lk) = \Pr(N_l^2 - E[N_l^2] \le -h_l l k) \le \exp(-\epsilon^2 k / 4).
$$

Note that $N_l^1 \ge lk$ is equivalent to $t'_{lk} \le F^{-1}((1-h_l)l)$, and $N_l^2 < lk$ is
equivalent to $t'_{lk} > F^{-1}((1+h_l)l)$. Therefore, by union bound,
$F^{-1}((1-h_l)l) \le t'_{lk} \le F^{-1}((1+h_l)l)$, $\forall l = \epsilon, 2\epsilon, 4\epsilon, \ldots$
w.p. $1 - 2\ln(1/\epsilon)\exp(-\epsilon^2 k/4) \ge 1 - \epsilon$. □
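The quantile estimates can be sketched in Python as order statistics of the past arrival times. Reading $t'_{lk}$ as the $\lceil lk \rceil$-th smallest past arrival time is an assumption here, chosen because it is consistent with the counting argument in the proof of Lemma 8; the function name `empirical_update_times` is hypothetical, as is stopping at $l = 1/2$.

```python
import math

def empirical_update_times(past_arrivals, eps):
    """Return {l: t'_{lk}}, with t'_{lk} the ceil(l*k)-th smallest of the k past arrival times."""
    ordered = sorted(past_arrivals)
    k = len(ordered)
    times = {}
    l = eps
    while l < 1.0:
        rank = max(1, math.ceil(l * k))
        times[l] = ordered[rank - 1]
        l *= 2.0
    return times

past = [i / 100 for i in range(1, 101)]        # 100 evenly spaced past arrivals on (0, 1]
times = empirical_update_times(past, 0.125)    # update points for l = 1/8, 1/4, 1/2
```

Only a handful of order statistics are needed (one per element of $L$), which is why far fewer past observations suffice than for a point-wise estimate of the whole CDF.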

After obtaining estimates of $F^{-1}(l)$, let us present the online algorithm DPAD. The only
difference from DPA is that instead of updating at $F^{-1}(l)$, DPAD updates its pricing policy at
$t'_{lk}$.



Algorithm 3 Dynamic Pricing Algorithm with Data (DPAD)
1: Reject all customers arriving earlier than $t'_{\epsilon k}$.
2: Update dual prices $\hat{p}_l$ at time $t'_{\epsilon k}, t'_{2\epsilon k}, t'_{4\epsilon k}, \ldots$ according to LP (10) given below.
3: Let $x_j = x_j(\hat{p}_l)$ for customers arriving between $t'_{lk}$ and $t'_{2lk}$.



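$$
\begin{array}{lll}
\max & \sum_{j \in S_l, o \in O_j} \pi_{jo} x_{jo} & \\
\text{s.t.} & \sum_{j \in S_l, o \in O_j} a_{ijo} x_{jo} \le (1 - 6h_l) l b_i, & \forall i \\
 & \sum_{o \in O_j} x_{jo} \le 1, & \forall j \in S_l \\
 & x_{jo} \ge 0, & \forall j \in S_l, o \in O_j
\end{array}
\qquad (10)
$$

where $h_l = \epsilon\sqrt{1/l}$ and $S_l = \{j : t_j \le t'_{lk}\}$.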

Let $\mathcal{E}_{est}$ denote the event where $F^{-1}(\epsilon), F^{-1}(2\epsilon), F^{-1}(4\epsilon), \ldots$
are well estimated as in Lemma 8. Given $\mathcal{E}_{est}$, we would expect DPAD to have many
properties similar to DPA. Indeed, this is the case, and the analysis is almost identical. We show
that with high probability, the resulting solution is feasible, the resulting solution is near
optimal, and the loss due to observation is small.

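Lemma 9. Given $\mathcal{E}_{est}$, if $\min_i b_i \ge 3m\ln(nq/\epsilon)/\epsilon^2$, then with probability $1-\epsilon$,

$$
\sum_{j \in S_{2l} \setminus S_l, o \in O_j} a_{ijo} x_{jo}(\hat{p}_l) \le l b_i, \quad \forall i, l.
$$

Lemma 10. Given $\mathcal{E}_{est}$, if $\min_i b_i \ge 3m\ln(nq/\epsilon)/\epsilon^2$, then with probability $1-\epsilon$,

$$
\sum_{j \in S_{2l}, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \ge (1 - 9h_l - \epsilon)\mathrm{OPT}_{2l}, \quad \forall l.
$$

Lemma 11. Given $\mathcal{E}_{est}$, $E[\mathrm{OPT}_l] \le (1 + h_l) l \cdot \mathrm{OPT}$.

From the lemmas above, we can conclude:

Theorem 3. If $\min_i b_i \ge 3m\ln(nq/\epsilon)/\epsilon^2$ and $k \ge 5\ln(1/\epsilon)/\epsilon^2$, the algorithm is $1 - O(\epsilon)$ competitive.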
The assumptions made in the theorem are reasonable. On the one hand, the lower bound on
$\min_i b_i$ is of the same order as in DPA, which has been shown to be best possible on many
occasions. On the other hand, the lower bound on $k$ is even lower than the one on $\min_i b_i$,
which means only a limited amount of past observations is required to obtain the near-optimal
result.
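To compare the two requirements of Theorem 3 numerically, the following Python sketch evaluates both lower bounds. The helper names and the instance sizes ($m = 5$, $n = 10^4$, $q = 3$, $\epsilon = 0.1$) are illustrative assumptions, not values from the paper.

```python
import math

def capacity_bound(m, n, q, eps):
    """Lower bound on min_i b_i in Theorem 3: 3 m ln(nq/eps) / eps^2."""
    return 3 * m * math.log(n * q / eps) / eps ** 2

def past_data_bound(eps):
    """Lower bound on the number k of past customers: 5 ln(1/eps) / eps^2."""
    return 5 * math.log(1 / eps) / eps ** 2

b_min = capacity_bound(m=5, n=10_000, q=3, eps=0.1)
k_min = past_data_bound(eps=0.1)
```

For this instance the capacity requirement is roughly 18,900 units while about 1,150 past arrival times suffice, in line with the remark that the bound on $k$ is the milder of the two.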

DPAD does not take advantage of demands and prices information from past customers. In 
the random permutation model, such information is unlikely to improve the online algorithm. But 
for practical problems where demands and payments come from unknown i.i.d. distributions, these 
data may provide good estimation on the distributions, and lead to better results. 



5 Heterogeneous Customers 

As we may see in many applications, customers are not all homogeneous. Customers with different
preferences may have different arrival times. For instance, in airline revenue management
problems, casual travelers, whose reserve prices are probably lower, usually arrive long before
their scheduled departure time, while business travelers, who tend to be price-insensitive, are
more likely to appear shortly before their intended trips. In this section, we take this
heterogeneity into account.

Assume all customers are categorized into $K$ groups: $N = \bigcup_{k=1}^{K} N_k$. Furthermore, we
assume there exists a constant $c$ such that $\forall k, k', t$, $F_k(t) \le c F_{k'}(t)$. Let $t_0$
be the $\epsilon$-quantile point, i.e. $F_1(t_0) = \epsilon$. Assume $F_k(t_0) = r_k \epsilon$. From
the assumption on the CDFs, we have $r_k \in [1/c, c]$.

Let $S_k$ be the set of customers from group $k$ that arrive before $t_0$. The online algorithm
observes customers arriving before $t_0$ and solves the following LPs:

max Efcfae) -1 Lies* fjosfo . n ±v k 

s-t. E fc (r fe e) E^W, <e(l-e)6i, Mi g t r fe e^ + E* 4joPi > > Y?> °> k (U) 
l^oe0 3 x jo < ^j,k p,q>0 
x > 



Similar to the arguments in the previous sections, we show that with high probability, the
resulting solution is feasible, the resulting solution is near optimal, and the loss due to
observation is small:

Lemma 12. If $\min_i b_i \ge 3cm\ln(nq/\epsilon)/\epsilon^3$, then w.p. $1-\epsilon$, $\forall i$,
$\sum_k \sum_{j \in N_k, o} a_{ijo} x_{jo}(\hat{p}) \le b_i$.

Lemma 13. If $\min_i b_i \ge 3cm\ln(nq/\epsilon)/\epsilon^3$, then w.p. $1-\epsilon$,
$\sum_k \sum_{j \in N_k, o} \pi_{jo} x_{jo}(\hat{p}) \ge (1 - 3\epsilon)\mathrm{OPT}$.

Lemma 14. $E[\sum_k \sum_{j \in S_k, o} \pi_{jo} \hat{x}_{jo}] \le \max_k r_k \epsilon \mathrm{OPT} \le c\epsilon \mathrm{OPT}$.

Combining the three lemmas above, we can conclude that:

Theorem 4. If $\min_i b_i \ge 3cm\ln(nq/\epsilon)/\epsilon^3$, the algorithm is $1 - O(\epsilon)$ competitive.

Unfortunately, the dynamic pricing techniques used in DPA and DPAD do not apply to this problem,
because the arrival process is not homogeneous here. Without a dynamic pricing mechanism, the lower
bound on capacities needed for a near-optimality result is much higher than the one we obtained in
the previous sections. It is worth noting that this approach can only deal with problems where
customers of each type are well represented in the early stages. Otherwise, we may need
additional assumptions or information to obtain good results. For example, consider airline ticket
sales where all casual travelers arrive at least one week before departure and all business
travelers only appear within one week of departure. If the numbers of travelers of the two types
are unrelated, then past information alone is unlikely to help us decide how many seats to reserve
for business customers. To find a proper reserve level, we need good estimates of the numbers of
travelers of the two types, which probably requires more assumptions.
A Proof of Lemma 1

Lemma 1. For non-degenerate $j \in S_\epsilon$, $\hat{x}_{jo} = x_{jo}(\hat{p})$ for all $o \in O_j$.

Proof. Suppose there exists $o \in O_j$ such that $\hat{x}_{jo} > 0$. From complementary slackness,
we have $\sum_i a_{ijo} \hat{p}_i + \hat{q}_j = \pi_{jo}$. Since $\forall o' \in O_j$,

$$
\pi_{jo'} - \sum_i a_{ijo'} \hat{p}_i \le \hat{q}_j = \pi_{jo} - \sum_i a_{ijo} \hat{p}_i,
$$

we have $x_{jo}(\hat{p}) = 1$ and $\hat{q}_j \ge 0$. Since $j$ is non-degenerate, then
$\hat{q}_j > 0$. Combined with complementary slackness, we have
$\sum_{o' \in O_j} \hat{x}_{jo'} = 1$. Note that $\forall o' \ne o$, $\hat{x}_{jo'} = 0$ because
$\pi_{jo'} - \sum_i a_{ijo'} \hat{p}_i < \hat{q}_j$. Hence, $\hat{x}_{jo} = 1 = x_{jo}(\hat{p})$.

If $\hat{x}_{jo} = 0$ for all $o \in O_j$, then $\hat{q}_j = 0$. Since
$\pi_{jo} - \sum_i a_{ijo} \hat{p}_i \le \hat{q}_j$ and $j$ is non-degenerate,
$\pi_{jo} - \sum_i a_{ijo} \hat{p}_i < 0$. Therefore, $x_{jo}(\hat{p}) = 0$ for all $o \in O_j$. □

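B Omitted Proofs in Section 3

Lemma 5. If $\min_i b_i \ge 10m\ln(nq/\epsilon)/\epsilon^2$, then w.p. $1-\epsilon$,
$\sum_{j \in S_{2l} \setminus S_l, o \in O_j} a_{ijo} x_{jo}(\hat{p}_l) \le l b_i$, $\forall i, l$.

Proof. The proof is very similar to the one of Lemma 2. Let us first consider the probability of
the event

$$
\Big\{ \sum_{j \in S_l, o \in O_j} a_{ijo} x_{jo}(p_l) \le (1 - h_l) l b_i, \ \sum_{j \in S_{2l} \setminus S_l, o \in O_j} a_{ijo} x_{jo}(p_l) > l b_i \Big\}
\qquad (12)
$$

for all fixed $l$, $p_l$, and $i$:

$$
\begin{aligned}
& \Pr\Big(\sum_{j \in S_l, o \in O_j} a_{ijo} x_{jo}(p_l) \le (1 - h_l) l b_i, \ \sum_{j \in S_{2l} \setminus S_l, o \in O_j} a_{ijo} x_{jo}(p_l) > l b_i\Big) \\
& \le \Pr\Big(\sum_{j \in S_l, o \in O_j} a_{ijo} x_{jo}(p_l) \le (1 - h_l) l b_i, \ \sum_{j \in S_{2l}, o \in O_j} a_{ijo} x_{jo}(p_l) \ge (2 - h_l) l b_i\Big) \\
& \quad + \Pr\Big(\sum_{j \in S_{2l} \setminus S_l, o \in O_j} a_{ijo} x_{jo}(p_l) > l b_i, \ \sum_{j \in S_{2l}, o \in O_j} a_{ijo} x_{jo}(p_l) < (2 - h_l) l b_i\Big)
\end{aligned}
$$

Note that $\Pr(j \in S_l \mid j \in S_{2l}) = 1/2$. From Bernstein inequalities, the first term is
upper bounded by $\exp(-\epsilon^2 b_i / 10)$. Similarly, the second term is also upper bounded by
$\exp(-\epsilon^2 b_i / 10)$. Thus,

$$
\Pr\Big(\sum_{j \in S_l, o \in O_j} a_{ijo} x_{jo}(p_l) \le (1 - h_l) l b_i, \ \sum_{j \in S_{2l} \setminus S_l, o \in O_j} a_{ijo} x_{jo}(p_l) > l b_i\Big) \le 2\exp(-\epsilon^2 b_i / 10).
$$

Note that for each $l$, there are at most $(nq)^m$ distinct $p_l$ regions. By union bounds, we have
that with probability at most $\epsilon$, there exist $i$, $l$, and $p_l$ such that (12) is true. On
the other hand, from Lemma 1, we have
$\sum_{j \in S_l, o \in O_j} a_{ijo} x_{jo}(\hat{p}_l) \le \sum_{j \in S_l, o \in O_j} a_{ijo} \hat{x}_{jo} \le (1 - h_l) l b_i$.
By letting $p_l = \hat{p}_l$, we can conclude our lemma. □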

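Lemma 6. If $\min_i b_i \ge 10m\ln(nq/\epsilon)/\epsilon^2$, then w.p. $1-\epsilon$,
$\sum_{j \in S_{2l}, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \ge (1 - 2h_l - \epsilon)\mathrm{OPT}_{2l}$, $\forall l$.

Proof. Let us consider the following LP:

$$
\begin{array}{lll}
\max & \sum_{j \in S_{2l}, o} \pi_{jo} x_{jo} & \\
\text{s.t.} & \sum_{j \in S_{2l}, o} a_{ijo} x_{jo} \le \hat{b}_i, & \forall i \\
 & x_{jo} \ge 0, & \forall j \in S_{2l}, o
\end{array}
\qquad (13)
$$

where $\hat{b}_i = \sum_{j \in S_{2l}, o} a_{ijo} x_{jo}(\hat{p}_l)$ if $\hat{p}_{l,i} > 0$, and
$\hat{b}_i = \max\{\sum_{j \in S_{2l}, o} a_{ijo} x_{jo}(\hat{p}_l), b_i\}$ if $\hat{p}_{l,i} = 0$. By
complementary slackness, $x(\hat{p}_l)$ is the optimal solution to this LP.

On the other hand, using the same argument as in the proof of Lemma 3, we can show that
with probability $1-\epsilon$, $\hat{b}_i \ge 2l \cdot (1 - 2h_l - \epsilon) b_i$ for all $i$ and
$l$. In that case,
$\sum_{j \in S_{2l}, o} \pi_{jo} x_{jo}(\hat{p}_l) \ge (1 - 2h_l - \epsilon)\mathrm{OPT}_{2l}$. □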

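Lemma 7. $E[\mathrm{OPT}_l] \le l \cdot \mathrm{OPT}$.

Proof. Consider the optimal dual solution $(p^*, q^*)$ to LP (6). We can easily check that it is
feasible to the dual problem of LP (9):

$$
\begin{array}{lll}
\min & \sum_i (1 - h_l) l b_i p_i + \sum_{j \in S_l} q_j & \\
\text{s.t.} & \sum_i a_{ijo} p_i + q_j \ge \pi_{jo}, & \forall j \in S_l, o \in O_j \\
 & p_i \ge 0, & \forall i \\
 & q_j \ge 0, & \forall j \in S_l
\end{array}
$$

Thus, $\mathrm{OPT}_l \le (1 - h_l) l \sum_i b_i p_i^* + \sum_{j \in S_l} q_j^*$. Note that for every
$j$, $\Pr(j \in S_l) = l$. By taking expectation on both sides, we can conclude the lemma. □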

Theorem 2. //minj&j > 10mln(nq/e)/e 2 , then DPA is 1 — 0(e) competitive. 

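Proof. Let $\mathcal{E}_2$ denote the event that both inequalities in Lemma 5 and 6 are true, then $\Pr(\mathcal{E}_2) \ge 1 - 2\epsilon$.

$$\begin{aligned}
& \mathbb{E}\Big[\sum_{l \in L} \sum_{j \in S_{2l} \setminus S_l,\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \,\Big|\, \mathcal{E}_2\Big] \\
&\ge \sum_{l \in L} \mathbb{E}\Big[\sum_{j \in S_{2l},\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \,\Big|\, \mathcal{E}_2\Big] - \sum_{l \in L} \mathbb{E}\Big[\sum_{j \in S_l,\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \,\Big|\, \mathcal{E}_2\Big] \\
&\ge \sum_{l \in L} (1 - 2h_l - \epsilon)\, \mathbb{E}[\mathrm{OPT}_{2l} \mid \mathcal{E}_2] - \sum_{l \in L} \mathbb{E}[\mathrm{OPT}_l \mid \mathcal{E}_2] \\
&\ge \mathrm{OPT} - \sum_{l \in L} 2h_l\, \mathbb{E}[\mathrm{OPT}_{2l} \mid \mathcal{E}_2] - \epsilon \sum_{l \in L} \mathbb{E}[\mathrm{OPT}_{2l} \mid \mathcal{E}_2] - \mathbb{E}[\mathrm{OPT}_\epsilon \mid \mathcal{E}_2] \\
&\ge \mathrm{OPT} - 4 \sum_{l \in L} h_l l \cdot \mathrm{OPT} - 2\epsilon \sum_{l \in L} l \cdot \mathrm{OPT} - \epsilon\, \mathrm{OPT} \\
&\ge \mathrm{OPT} - 13\epsilon\, \mathrm{OPT}.
\end{aligned}$$

Therefore, $\mathbb{E}\big[\sum_{l \in L} \sum_{j \in S_{2l} \setminus S_l,\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l)\big] \ge (1 - 15\epsilon)\,\mathrm{OPT}$. □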
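To get a feel for how demanding the capacity condition of Theorem 2 is, the following sketch evaluates thresholds of the form $\text{const} \cdot m \ln(nq/\epsilon)/\epsilon^{\text{power}}$. The function name, its parameters, and the instance sizes are hypothetical; only the shape of the formula is taken from the theorem statements (constant 10 and power 2 here, other constants and powers in the later theorems).

```python
import math

def min_capacity(m, n, q, eps, const=10.0, power=2):
    """Evaluate const * m * ln(n * q / eps) / eps**power, a lower bound required on min_i b_i."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie in (0, 1)")
    return const * m * math.log(n * q / eps) / eps ** power

# Hypothetical instance: 5 resources, 10,000 requests, 3 options per request.
for eps in (0.2, 0.1, 0.05):
    print(eps, round(min_capacity(m=5, n=10_000, q=3, eps=eps)))
```

Halving $\epsilon$ more than quadruples the required capacity when the power is 2, so the guarantee is meaningful mainly when every resource is large relative to individual requests.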

C Omitted Proofs in Section 4

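Lemma 9. Given $\mathcal{E}_{est}$, if $\min_i b_i \ge 3 m \ln(nq/\epsilon)/\epsilon^2$, then with probability $1 - \epsilon$,

$$\sum_{j \in S_{2l} \setminus S_l,\, o \in O_j} a_{ijo} x_{jo}(\hat{p}_l) \le l b_i, \quad \forall i, l.$$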

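Proof. Fix $p$, $i$, and $l$. Let $X_j = \sum_{o \in O_j} a_{ijo} x_{jo}(p)$. Then,

$$\begin{aligned}
& \Pr\Big(\sum_{j \in S_{2l} \setminus S_l} X_j > l b_i,\ \sum_{j \in S_l} X_j < (1 - 6h_l)\, l b_i\Big) \\
&\le \Pr\Big(\sum_{j \in S_l} X_j < (1 - 6h_l)\, l b_i,\ \sum_{j \in S_{2l}} X_j \ge 2(1 - 3h_l)\, l b_i\Big) \\
&\quad + \Pr\Big(\sum_{j \in S_{2l} \setminus S_l} X_j > l b_i,\ \sum_{j \in S_{2l}} X_j < 2(1 - 3h_l)\, l b_i\Big).
\end{aligned}$$

Since $\Pr(j \in S_l \mid j \in S_{2l}) = F(t'_{lk})/F(t'_{2lk}) \le (1 + 2h_l)/2$, the first term

$$\Pr\Big(\sum_{j \in S_l} X_j < (1 - 6h_l)\, l b_i,\ \sum_{j \in S_{2l}} X_j \ge 2(1 - 3h_l)\, l b_i\Big)$$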

< Pr( E *j " E[ E *j] < min{-/ii E *iA -/*A}, E *j > 2(1 - 3/i/)^i) • 

jes t jeSi jes 2i jes 2t 

$\le \exp(-\epsilon^2 b_i/3)$.

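Since $\Pr(j \in S_{2l} \setminus S_l \mid j \in S_{2l}) = 1 - \Pr(j \in S_l \mid j \in S_{2l}) \ge (1 - 2h_l)/2$, the second term

$$\begin{aligned}
& \Pr\Big(\sum_{j \in S_{2l} \setminus S_l} X_j > l b_i,\ \sum_{j \in S_{2l}} X_j < 2(1 - 3h_l)\, l b_i\Big) \\
&\le \Pr\Big(\sum_{j \in S_{2l} \setminus S_l} X_j - \mathbb{E}\Big[\sum_{j \in S_{2l} \setminus S_l} X_j\Big] > h_l\, l b_i,\ \sum_{j \in S_{2l}} X_j < 2(1 - 3h_l)\, l b_i\Big) \\
&\le \exp(-\epsilon^2 b_i/3).
\end{aligned}$$

Therefore, $\Pr\big(\sum_{j \in S_{2l} \setminus S_l} X_j > l b_i,\ \sum_{j \in S_l} X_j < (1 - 6h_l)\, l b_i\big) \le 2 \exp(-\epsilon^2 b_i/3)$. By taking union
bounds over all possible $p_l$, $i$, and $l$, we can conclude the lemma. □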

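Lemma 10. Given $\mathcal{E}_{est}$, if $\min_i b_i \ge 3 m \ln(nq/\epsilon)/\epsilon^2$, then with probability $1 - \epsilon$,

$$\sum_{j \in S_{2l},\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \ge (1 - 9h_l - \epsilon)\,\mathrm{OPT}_{2l}, \quad \forall l.$$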

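Proof. Let us consider the following LP:

$$\begin{array}{ll}
\max & \sum_{j \in S_{2l},\, o \in O_j} \pi_{jo} x_{jo} \\
\text{s.t.} & \sum_{j \in S_{2l},\, o \in O_j} a_{ijo} x_{jo} \le \hat{b}_i, \quad \forall i \\
& x_{jo} \ge 0, \quad \forall j \in S_{2l},\, o
\end{array} \qquad (14)$$

where $\hat{b}_i = \sum_{j \in S_{2l},\, o \in O_j} a_{ijo} x_{jo}(\hat{p}_l)$, if $\hat{p}_{l,i} > 0$, and $\hat{b}_i = \max\{\sum_{j \in S_{2l},\, o \in O_j} a_{ijo} x_{jo}(\hat{p}_l),\ b_i\}$, if $\hat{p}_{l,i} = 0$. By
complementary slackness, $x(\hat{p}_l)$ is the optimal solution to this LP.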

On the other hand, using the same argument as in the proof of Lemma [31 we can show that 
with probability $1 - \epsilon$, $\hat{b}_i \ge 2l\,(1 - 9h_l - \epsilon)\, b_i$ for all $i$ and $l$. In that case, $\sum_{j \in S_{2l},\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \ge (1 - 9h_l - \epsilon)\,\mathrm{OPT}_{2l}$. □

Lemma 11. Given $\mathcal{E}_{est}$, $\mathbb{E}[\mathrm{OPT}_l] \le (1 + h_l)\, l \cdot \mathrm{OPT}$.

Proof. Consider the optimal dual solution (p* , q* ) to LP ([TJ . We can easily check that it is feasible 
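to the partial dual problem (9):

$$\begin{array}{ll}
\min & \sum_i (1 - 6h_l)\, l b_i p_i + \sum_{j \in S_l} q_j \\
\text{s.t.} & \sum_i a_{ijo} p_i + q_j \ge \pi_{jo}, \quad \forall j \in S_l,\ o \in O_j \\
& p_i \ge 0, \quad \forall i \\
& q_j \ge 0, \quad \forall j \in S_l
\end{array}$$

Thus, $\mathrm{OPT}_l \le (1 - 6h_l)\, l \sum_i b_i p_i^* + \sum_{j \in S_l} q_j^*$. Note that given $\mathcal{E}_{est}$, for every $j$, $\Pr(j \in S_l) \le (1 + h_l)\, l$.
By taking expectation on both sides, we can conclude the lemma. □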

Theorem 3. If $\min_i b_i \ge 3 m \ln(nq/\epsilon)/\epsilon^2$ and $k \ge 5 \ln \epsilon/\epsilon^2$, the algorithm is $1 - O(\epsilon)$ competitive.
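Proof. Let $\mathcal{E}_3$ denote the event that both inequalities in Lemma 9 and 10 are true, then $\Pr(\mathcal{E}_3 \cap \mathcal{E}_{est}) \ge 1 - 3\epsilon$.

$$\begin{aligned}
& \mathbb{E}\Big[\sum_{l \in L} \sum_{j \in S_{2l} \setminus S_l,\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \,\Big|\, \mathcal{E}_3 \cap \mathcal{E}_{est}\Big] \\
&\ge \sum_{l \in L} \mathbb{E}\Big[\sum_{j \in S_{2l},\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \,\Big|\, \mathcal{E}_3 \cap \mathcal{E}_{est}\Big] - \sum_{l \in L} \mathbb{E}\Big[\sum_{j \in S_l,\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l) \,\Big|\, \mathcal{E}_3 \cap \mathcal{E}_{est}\Big] \\
&\ge \sum_{l \in L} (1 - 9h_l - \epsilon)\, \mathbb{E}[\mathrm{OPT}_{2l} \mid \mathcal{E}_3 \cap \mathcal{E}_{est}] - \sum_{l \in L} \mathbb{E}[\mathrm{OPT}_l \mid \mathcal{E}_3 \cap \mathcal{E}_{est}] \\
&\ge \mathrm{OPT} - \sum_{l \in L} 9h_l\, \mathbb{E}[\mathrm{OPT}_{2l} \mid \mathcal{E}_3 \cap \mathcal{E}_{est}] - \epsilon \sum_{l \in L} \mathbb{E}[\mathrm{OPT}_{2l} \mid \mathcal{E}_3 \cap \mathcal{E}_{est}] - \mathbb{E}[\mathrm{OPT}_\epsilon \mid \mathcal{E}_3 \cap \mathcal{E}_{est}]
\end{aligned}$$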

> OPT - 9 E (27z z Z + • OPT - e E 4/ • OPT - 2eOPT 

zeL zeL 

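$\ge \mathrm{OPT} - 42\epsilon\, \mathrm{OPT}$

Therefore, $\mathbb{E}\big[\sum_{l \in L} \sum_{j \in S_{2l} \setminus S_l,\, o \in O_j} \pi_{jo} x_{jo}(\hat{p}_l)\big] \ge (1 - 45\epsilon)\,\mathrm{OPT}$. □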
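The passage from the conditional bound to the unconditional one is a standard conditioning step. As a sketch, write $R \ge 0$ for the online revenue and $\mathcal{E} = \mathcal{E}_3 \cap \mathcal{E}_{est}$ (the symbol $R$ is introduced here only for illustration):

```latex
\begin{align*}
\mathbb{E}[R]
  &\ge \mathbb{E}[R \mid \mathcal{E}] \cdot \Pr(\mathcal{E})
     && \text{since } R \ge 0 \\
  &\ge (1 - 42\epsilon)(1 - 3\epsilon)\,\mathrm{OPT} \\
  &= (1 - 45\epsilon + 126\epsilon^2)\,\mathrm{OPT} \\
  &\ge (1 - 45\epsilon)\,\mathrm{OPT}.
\end{align*}
```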

D Omitted Proofs in Section 5

Lemma 12. If $\min_i b_i \ge 3 c m \ln(nq/\epsilon)/\epsilon^3$, then w.p. $1 - \epsilon$, $\forall i$, $\sum_k \sum_{j \in N_k,\, o \in O_j} a^k_{ijo} x^k_{jo}(\hat{p}) \le b_i$.

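Proof. The proof is very similar to the one of Lemma 2. Let us first consider the probability of the
event

$$\Big\{\sum_k (r_k \epsilon)^{-1} \sum_{j \in S_k,\, o \in O_j} a^k_{ijo} x^k_{jo}(p) \le (1 - \epsilon)\, b_i,\ \sum_k \sum_{j \in N_k,\, o \in O_j} a^k_{ijo} x^k_{jo}(p) > b_i\Big\} \qquad (15)$$

for all fixed $p$ and $i$. Since $\Pr(j \in S_k \mid j \in N_k) = r_k \epsilon$, we expect $\sum_k (r_k \epsilon)^{-1} \sum_{j \in S_k,\, o \in O_j} a^k_{ijo} x^k_{jo}$ to be close to
its mean $\sum_k \sum_{j \in N_k,\, o \in O_j} a^k_{ijo} x^k_{jo}$. Therefore, event (15) should be a rare event. More precisely,

$$\Pr\Big(\sum_k (r_k \epsilon)^{-1} \sum_{j \in S_k,\, o \in O_j} a^k_{ijo} x^k_{jo}(p) \le (1 - \epsilon)\, b_i,\ \sum_k \sum_{j \in N_k,\, o \in O_j} a^k_{ijo} x^k_{jo}(p) > b_i\Big) \le \exp(-\epsilon^3 b_i/(3c)).$$

Note that there are at most $(nq)^m$ distinct $p$. By taking union bounds over all distinct $p$ and $i$, we
can conclude the lemma. □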
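The claim that the rescaled sample sum has the full-population sum as its mean can be verified exactly on a toy instance. The sketch below assumes each request is kept independently with the same probability `keep_prob` (standing in for $r_k \epsilon$); the values and function name are made up for illustration. It enumerates every possible sample with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def expected_scaled_sample_sum(values, keep_prob):
    """Exact E[(1 / keep_prob) * sum of sampled values], enumerating every subset."""
    total = Fraction(0)
    for mask in product((0, 1), repeat=len(values)):
        prob = Fraction(1)
        for bit in mask:
            prob *= keep_prob if bit else 1 - keep_prob
        sampled = sum(v for v, bit in zip(values, mask) if bit)
        total += prob * sampled / keep_prob
    return total

values = [3, 1, 4, 1, 5]      # made-up consumptions a_ijo * x_jo(p)
keep_prob = Fraction(1, 10)   # plays the role of r_k * eps
assert expected_scaled_sample_sum(values, keep_prob) == sum(values)
```

Unbiasedness alone does not give the lemma; the proof additionally needs the deviation from this mean to be small with high probability, which is where the capacity condition enters.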

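Lemma 13. If $\min_i b_i \ge 3 c m \ln(nq/\epsilon)/\epsilon^3$, then w.p. $1 - \epsilon$, $\sum_k \sum_{j \in N_k,\, o \in O_j} \pi^k_{jo} x^k_{jo}(\hat{p}) \ge (1 - 3\epsilon)\,\mathrm{OPT}$.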

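Proof. Let us consider the following LP:

$$\begin{array}{ll}
\max & \sum_k \sum_{j \in N_k,\, o \in O_j} \pi^k_{jo} x^k_{jo} \\
\text{s.t.} & \sum_k \sum_{j \in N_k,\, o \in O_j} a^k_{ijo} x^k_{jo} \le \hat{b}_i, \quad \forall i \\
& \sum_{o \in O_j} x^k_{jo} \le 1, \quad \forall j, k \\
& x^k_{jo} \ge 0, \quad \forall j, o, k
\end{array} \qquad (16)$$

where $\hat{b}_i = \sum_k \sum_{j \in N_k,\, o \in O_j} a^k_{ijo} x^k_{jo}(\hat{p})$, if $\hat{p}_i > 0$, and $\hat{b}_i = \max\{\sum_k \sum_{j \in N_k,\, o \in O_j} a^k_{ijo} x^k_{jo}(\hat{p}),\ b_i\}$, if $\hat{p}_i = 0$. By
complementary slackness, $x(\hat{p})$ is the optimal solution to this LP.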

On the other hand, using the same argument as in the proof of Lemma we can show that with 
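probability $1 - \epsilon$, $\hat{b}_i \ge (1 - 3\epsilon)\, b_i$ for all $i$ and $l$. If such an event happens, then $\sum_k \sum_{j \in N_k,\, o \in O_j} \pi^k_{jo} x^k_{jo}(\hat{p}) \ge (1 - 3\epsilon)\,\mathrm{OPT}$. □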
Lemma 14. $\mathbb{E}\big[\sum_k \sum_{j \in S_k,\, o \in O_j} \pi^k_{jo} x^k_{jo}\big] \le \max_k r_k \epsilon\, \mathrm{OPT} \le c \epsilon\, \mathrm{OPT}$.

Proof. Let OPT e be the optimal value to the partial LP (jlip . Let (p*,q*) be the optimal dual 
solution to the complete LP (fTB"j) . It is easy to check that (p*, q*) is a dual feasible solution to (fTT]) , 
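Therefore, $\mathbb{E}[\mathrm{OPT}_\epsilon] \le \mathrm{OPT}$.

On the other hand, for any realization, the lost revenue resulting from the first $\epsilon$ fraction of
customers is no more than $\mathrm{OPT}_\epsilon$. Hence,

$$\mathbb{E}\Big[\sum_k \sum_{j \in S_k,\, o \in O_j} \pi^k_{jo} x^k_{jo}\Big] \le \mathbb{E}\big[\max_k r_k \epsilon \cdot \mathrm{OPT}_\epsilon\big] \le \max_k r_k \epsilon \cdot \mathrm{OPT}.$$

□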

Theorem 4. If $\min_i b_i \ge 3 c m \ln(nq/\epsilon)/\epsilon^3$, the algorithm is $1 - O(\epsilon)$ competitive.
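Proof. Let $\mathcal{E}_4$ denote the event that for all $i$

$$\sum_k \sum_{j \in N_k,\, o \in O_j} a^k_{ijo} x^k_{jo}(\hat{p}) \le b_i$$

and

$$\sum_k \sum_{j \in N_k,\, o \in O_j} \pi^k_{jo} x^k_{jo}(\hat{p}) \ge (1 - 3\epsilon)\,\mathrm{OPT}.$$

Then, from Lemma 12 and 13, we have $\Pr(\mathcal{E}_4) \ge 1 - 2\epsilon$. Given $\mathcal{E}_4$, the online solution $x(\hat{p})$ is
feasible. Therefore, the expected revenue of the online algorithm is at least

$$\begin{aligned}
& \Big(\mathbb{E}\Big[\sum_k \sum_{j \in N_k,\, o \in O_j} \pi^k_{jo} x^k_{jo}(\hat{p}) \,\Big|\, \mathcal{E}_4\Big] - \mathbb{E}\Big[\sum_k \sum_{j \in S_k,\, o \in O_j} \pi^k_{jo} x^k_{jo} \,\Big|\, \mathcal{E}_4\Big]\Big) \cdot \Pr(\mathcal{E}_4) \\
&\ge (1 - 3\epsilon)(1 - 2\epsilon)\,\mathrm{OPT} - c\epsilon\, \mathrm{OPT} \\
&\ge (1 - O(\epsilon))\,\mathrm{OPT}.
\end{aligned}$$

□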
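For reference, the constant hidden in the final $O(\epsilon)$ can be made explicit by expanding the product in the second-to-last line; this is plain arithmetic and introduces no new assumption:

```latex
\begin{align*}
(1 - 3\epsilon)(1 - 2\epsilon) - c\epsilon
  &= 1 - 5\epsilon + 6\epsilon^2 - c\epsilon \\
  &= 1 - (5 + c)\epsilon + 6\epsilon^2 \\
  &\ge 1 - (5 + c)\epsilon .
\end{align*}
```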